Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix invalid generated column names in conversion events #327

Merged

Conversation

jerome-laurent-pro
Copy link
Contributor

@jerome-laurent-pro jerome-laurent-pro commented Jun 4, 2024

Description & motivation

I encountered an issue with non standard event names that I wanted to use as conversion events.
Stuff like 50%_completion or condition-click.
I can't change their names in GA4, so I opened this PR to propose a way to create clean column names when they are used to generate conversion events related models.
Currently, using the conversion_events variable with these events will result in errors as columns named like count_condition-click will be generated.

I found this issue that was related (#211)

Checklist

  • I have verified that these changes work locally
    • I'm using this exact same function but I'm trying to migrate to a more robust framework
  • I have updated the README.md (if applicable)
    • I don't think there's a need?
  • I have added tests & descriptions to my models (and macros if applicable)
  • I have run dbt test and python -m pytest . to validate existing tests

@jerome-laurent-pro jerome-laurent-pro marked this pull request as draft June 4, 2024 12:47
@jerome-laurent-pro jerome-laurent-pro changed the title Add tests Fix invalid generated column names in conversion events Jun 4, 2024
@adamribaudo-velir
Copy link
Collaborator

Great! Lmk if you need help finalizing the tests. Eventually we'll refactor to use dbt's newly released unit test functionality.

Comment on lines 26 to 31
# Update project name to ga4 so we can call macros with ga4.macro_name
@pytest.fixture(scope="class")
def project_config_update(self):
return {
"name": "ga4"
}
Copy link
Contributor Author

@jerome-laurent-pro jerome-laurent-pro Jun 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought adding this element like in the other tests on models using macros was going to fix the issue but it doesn't.

Lmk if you need help finalizing the tests

I'd be happy to have your help, @adamribaudo-velir

@jerome-laurent-pro jerome-laurent-pro marked this pull request as ready for review June 4, 2024 15:18
@adamribaudo-velir
Copy link
Collaborator

@jerome-laurent-pro check out the latest commit. You just needed to add the macro to unit tests that call the macro.

The tests still fail, but for a valid reason: What value is a user expected to enter into the conversion_events dbt variable? Either they need to enter the re-formatted event name or you'll need to update this line of code in stg_ga4__page_conversions and stg_ga4__session_conversions_daily:

{% for ce in var('conversion_events',[]) %}
    , countif(event_name = '{{ce}}') as {{ga4.valid_column_name(ce)}}_count
    {% endfor %}

Notice that the countif statement uses the original event name.

@jerome-laurent-pro
Copy link
Contributor Author

jerome-laurent-pro commented Jun 5, 2024

You just needed to add the macro to unit tests that call the macro.

Thank you!

Either they need to enter the re-formatted event name or you'll need to update this line of code

The idea was to use the official event_name from GA4 as it'll be named like this in the raw data. Users shouldn't have to guess the future safe name while entering the variable in the project's yaml.
So countif(event_name = '{{ce}}') will still count the events, but the new column will have a safe name (preventing dbt run from failing due to an incorrectly named column in the generated SQL)

The modification you did in the test was creating an error, because the safe column name for my-page-view is my_page_view_count. The aggregated expectation in the test was still named page_view_count, hence the failure in the test.
Unrecognized name: my_page_view_count; Did you mean page_view_count? at [7:21]
Using page-view works just fine.

In the other test, it seems that the project_config_update was erasing the static_incremental_days variable set in conftest.py.

return {"name": "ga4", "vars": {"static_incremental_days": 3}}

Adding it again fix the error. It seems like conftest isn't really working as I expected, otherwise I assume we wouldn't have had to do a project_config_update in each test.


@adamribaudo-velir, let me know if you think the way I implemented the functionality doesn't make sense or if you want me to add anything else.

@adamribaudo-velir
Copy link
Collaborator

Ah! Thanks, I think I misunderstood the change a bit. And sorry for inadvertently breaking the test while I fixed it 😓

Let me run through an example on my end, but I expect to merge this by end of week.

@adamribaudo-velir
Copy link
Collaborator

Confirmed this works as expected

image

image

@adamribaudo-velir adamribaudo-velir merged commit df85449 into Velir:main Jun 5, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants